(54) Information access system and method for providing a personal portal

(57) The present invention presents a system and method of providing information retrieved from a server
(30) from across a communication network that has been personalized with private information that is not
disclosed by the server (30). The document is retrieved from the server (30) via a standard document serving
protocol channel. Before serving the document to the client, the private information is retrieved from a local
storage device and merged with the information in the document to produce a customized document. In a
preferred embodiment of the present invention, the process is implemented by a proxy server or a client-side
proxy server connected to a standard conventional browser. The proxy server can provide filters and a scripting
service to make the process of customizing the information transparent to the normal document serving protocols.
Description

Field of the Invention

[0001] The present invention relates generally to information access systems, and, more particularly, to information
access systems used to retrieve information from across a communication network.

Background of the Invention 

[0002] As more resources become available on communication networks such as the Internet, it has become
increasingly more difficult to locate, manage, and integrate them. Many information retrieval clients such as the
browsers by Netscape, Microsoft, and Mosaic have been introduced to make searching and retrieving information on the
network more convenient and productive for end-users. Unfortunately, the closed architecture of such browsers has
rendered the software design overloaded and monopolistic. Customizing the information retrieval process is difficult if
not impossible.

[0003] For example, it would be advantageous to personalize information received from many centralized web sites.
Recently on the Internet there has been heated competition among several companies to establish "portals": websites
that assemble an array of services and contents to attract Internet users. The consensus is that there will be only a few
portals through which most customers access services and products on the Internet. Therefore, the most popular
portals are likely to reap top advertisement dollars and e-commerce transaction fees. Existing portals such as "My
Netscape" (http://my.netscape.com), "Excite" (http://my.excite.com), and "My Yahoo" (http://my.yahoo.com) are loaded
with similar features such as search engines, stock quotes, maps, weather information, and news. However, most of these
services have become commodities that can be obtained through companies like InfoSpace (http://www.infospace.com).
Portal builders have been busy creating new services in the hope of retaining existing customers. Recent examples
include AOL's "Instant Messenger" and Netscape's "Smart Browsing" and "WebMail".

[0004] Many of these new services attempt to collect a user's profile information and detect the user's presence on
the Internet so as to create direct marketing channels to the user. However, none of these new services has the ability
to combine the rich contents from the portal server with sensitive and private information on the client side to provide a
truly personal experience. For example, imagine that a user would like to get a list of stock quotes from the server and
combine that with local stock portfolio information to compute the net gain/loss and display the information on the
personal portal. The user may not want to give his or her financial details to the portal server, but this is required by all
portal implementations.

Summary of the Invention 

[0005] The present invention presents a system and method of providing information retrieved from a server from
across a communication network that has been personalized with private information that is not disclosed by the server.
The document is retrieved from the server via a standard document serving protocol channel. Before serving the
document to the client, the private information is retrieved from a local storage device and merged with the information in
the document to produce a customized document. In a preferred embodiment of the present invention, the process is
implemented by a proxy server or a client-side proxy server connected to a standard conventional browser. The proxy
server can provide filters and a scripting service to make the process of customizing the information transparent to the
normal document serving protocols.

[0006] These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference
to the following detailed description and the accompanying drawings.

Brief Description of the Drawings 

[0007]

Fig. 1 sets forth a diagram of an information access system and method illustrating an embodiment of the present
invention.
Fig. 2 sets forth an example of a system architecture illustrating an embodiment of the present invention.
Fig. 3 sets forth a list of system commands that can be invoked using URL extensions.
Fig. 4 sets forth a screenshot of an example service menu.
Fig. 5 sets forth a list of commands that can be invoked using HTTP extensions.
Fig. 6 sets forth a diagram illustrating the use of filters with an information access system and method.
Figs. 7A, 7B and 7C set forth examples of cgi-bin filter programs.
Fig. 8 sets forth a list of scripting commands that can be used with a document pre-processing system.
Fig. 9 sets forth arguments for use with a javabin class invoked by the "iblock" and "ijavabin" commands in Fig. 8.
Fig. 10 sets forth a diagram illustrating the use of archiving with an information system and method.
Fig. 11 sets forth an example of a grep class for use with a walking facility.
Fig. 12 sets forth a screenshot of an example cron menu.
Fig. 13 sets forth a screenshot of an example archiving menu.
Fig. 14 sets forth a diagram illustrating the use of multiple archiving repositories.
Fig. 15 sets forth an example of scripting directives embedded in an HTML comment.
Fig. 16 sets forth a screenshot of an example personalized portal.
Fig. 17 sets forth a screenshot of an example prior art portal displaying financial information.
Fig. 18 sets forth a screenshot of an example personalized portal.

Detailed Description 

[0008] Fig. 1 shows a diagram of a communications system which is suitable to practice the present invention. In
the exemplary embodiment shown, a proxy server 30 is connected to a communication network through communication
link 200 to a gateway 90. The proxy 30 is shown shared by a number of clients 60 who are each executing an
information retrieval program such as a browser. In an alternate embodiment, the proxy and the client browser can be
executed as processes on the same client machine. Servers 70 provide information content to the clients 60 utilizing some
document serving protocol, such as the Hypertext Transfer Protocol (HTTP) as described in T. Berners-Lee et al.,
"Hypertext Transfer Protocol - HTTP/1.0," RFC 1945, Network Working Group, 1996, which is incorporated by reference
herein. As used herein, a document serving protocol is a communication protocol for the transfer of information between
a client and a server. In accordance with such a protocol, a client 60 requests information from a server 70 by sending
a request to the server, and the server responds to the request by sending a document containing the requested
information to the client. Servers, and the information stored therein, can be identified through an identification mechanism
such as Uniform Resource Locators (URL), as described in detail in T. Berners-Lee et al., "Uniform Resource Locators,"
RFC 1738, Network Working Group, 1994, which is incorporated herein by reference. In an advantageous embodiment,
the network is the Internet and the servers 70 are Web servers.

[0009] Proxy server 30, as is known in the art, has a processor 140, memory 130 and a non-volatile storage device
150 such as a disk drive. The memory 130 includes areas for the storage of, for example, computer program code and
data, as further described below. The proxy server 30 is shown executing two processes: an access server 40 and a
web server 50, both of which are further described in the pending utility patent application, "Information Access System
And Method," Serial No. 08/994,600, filed on December 19, 1997. The access server 40 behaves like a typical proxy
server and accepts document requests (in a preferred embodiment, using standard TCP ports and HTTP) and routes
them to other proxies or the desired server. The built-in server 50 is designed to act as a function execution engine
rather than just an information provider.

[0010] Fig. 2 illustrates an exemplary system architecture, adapted for HTTP and implemented as Java classes,
providing the functionality necessary to practice an embodiment of the present invention. When the proxy application is
started, the main thread 2010 listens on the proxy port and receives and responds to HTTP requests from clients 60
(browsers or other proxies). The system creates a new User Thread 2020 to serve each new request. The iserver class
2025 parses the client's request and forwards commands to iagent 2035 for remote web access or to ihttpd 2026 for
localhost access. The iserver class 2025 also implements various protocol extensions further discussed below in
Section 1. The ihttpd class 2026 implements the web server described above and will return an external file with the http
header or execute a local CGI script to generate the replying message online. If the content is in a special scripting
format, described in further detail below in Section 3, icmd 2028 will be invoked to parse the script and interpret and
execute any commands embedded in the document page.

[0011] The iagent class 2035 of the agent thread 2030 connects to a remote web server 70 or proxy to request a
Web page. iagent 2035 can cache the page and return the page to iserver 2025 or to ihtwalk 2045, a facility to walk the
html tree structure to collect and archive pages. The ihtwalk class facility is further described below in Section 4 in the
description of the walking facilities.

1. Protocol Extensions 

[0012] The URL and HTTP protocols can be extended to include additional system commands that advantageously
permit conventional information retrieval clients to be utilized with embodiments of the present invention. This can be
accomplished in a manner that is transparent to the client browser application and, thus, requires no new user interface.

[0013] An example format for a URL extension is:

URL?iproxy&command

where "?iproxy" is a keyword, and "command" is a service that might be applied to a given URL or a system command
that is irrelevant to the given URL. For example, the command could invoke an archiving service (as described in further
detail below in Section 5) or can be used to tailor system functions. The iserver class 2025 parses the commands and
forwards the command to ihttpd 2026 which, in turn, forwards the command to icmd 2028 for processing. Fig. 3 sets
forth a variety of system commands that can be defined using URL extensions.
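
As a rough illustration of the URL extension format, the following Python sketch separates the plain URL from the proxy command. It is not the iserver class of Fig. 2; the names `split_extension` and `ExtendedUrl` are hypothetical, and the sketch assumes the keyword appears literally as "?iproxy&" at most once in the request.

```python
from typing import NamedTuple, Optional

KEYWORD = "?iproxy&"  # keyword from the text, followed by the command separator


class ExtendedUrl(NamedTuple):
    url: str                 # the plain URL the command applies to
    command: Optional[str]   # the proxy command, or None for an ordinary request


def split_extension(request_url: str) -> ExtendedUrl:
    """Split 'URL?iproxy&command' into its URL and command parts."""
    base, sep, command = request_url.partition(KEYWORD)
    if not sep:
        # No keyword present: treat the request as an ordinary URL.
        return ExtendedUrl(request_url, None)
    return ExtendedUrl(base, command)
```

A request without the keyword passes through unchanged, which is one way to keep the extension invisible to clients that never use it.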

[0014] In another embodiment of the present invention, a URL extension can be used to invoke a dynamically
generated menu of entry points for available services and present it to the user's browser when requested. For example,
the URL naming protocol can be extended by introducing a new naming scheme:

URL??

where the double-question mark "??" is a trigger to invoke a service menu. The service menu can be implemented to
inherit the same cookies as the current URL. The proxy subsystem intercepts the request from the browser with the
above request pattern, generates a menu on-the-fly and returns the menu to the browser as an HTML file. Fig. 4 shows
an illustrative example of a menu obtained after a user browses a webpage, here
"http://www.interactive.wsj.com/pages/techmain.htm", and types "??" after the URL and presses enter. The menu platform
defines entries for a number of services and user-defined macros on a per URL-set basis. For example, Fig. 4 permits the
user to, inter alia, change cache policies, archive or prefetch pages in the html tree, or search the page hierarchy for a
keyword. The menu system chooses the proper menu description based on the current URL.
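
The "??" trigger can be sketched in Python as follows. This is only an illustration under stated assumptions: `menu_target` and `render_menu` are hypothetical names, and the menu entries are supplied by the caller rather than selected from a per-URL-set menu description as the text describes.

```python
from html import escape


def menu_target(request_url):
    """Return the URL whose service menu is requested, or None if no '??' trigger."""
    return request_url[:-2] if request_url.endswith("??") else None


def render_menu(url, entries):
    """Build a plain HTML menu from (label, href) pairs for the given URL."""
    items = "\n".join(
        f'<li><a href="{escape(href, quote=True)}">{escape(label)}</a></li>'
        for label, href in entries
    )
    return (f"<html><body><h1>Services for {escape(url)}</h1>\n"
            f"<ul>\n{items}\n</ul></body></html>")
```

Because the menu is ordinary HTML, it is displayed by any browser without a new user interface, which matches the advantage claimed for the menu service.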

[0015] The menu service has numerous advantages. It does not introduce a new graphical user interface, but rather
presents the menu in HTML content shown in normal browsers. It preserves the user's cookies and is easily extensible.
New service entries can be readily plugged into the menu. The concept of the service menu is akin to pushing special
keys during a traditional telephone conversation, for example, "##" - the network then places the line on hold and
announces a service menu for selection; after menu selection, the line is placed back to the original conversation.

[0016] New commands can also be introduced to HTTP for special communications among multiple proxy servers
to, for example, establish special connections (like persistent channels) and/or perform value-added services (like TCP
forwarding). For example, the commands set forth in Fig. 5 can be defined for such communications and used to extend
the conventional HTTP commands.

2. Filters

[0017] Support is provided for the processing of data by filters. Fig. 6 illustrates how filters can be applied to http
headers, pages returned from the web, and pages returned from a local cache. For example, an input filter can be used
to add new components (menubars, etc.) or modify returned pages (replace some remote data with local data, etc.).
Data from servers can also be condensed, compressed, encrypted, patched, etc., prior to the corresponding filters
converting it back into its original format before returning it to a user.

[0018] The filter functions shown in Fig. 6 can be specified in a configuration file for the proxy's built-in web server.
The function can be specified with a corresponding URL pattern, such that the filter is applied to URLs matching the
pattern. For example, the following entries:

InputFilter /bin/fpack.cgi http://*/resume.html
OutputFilter /bin/funpack.cgi http://*/resume.html
HeaderFilter /bin/forward.cgi http://www.att.com/*

specify cgi functions for the indicated filters and URL patterns. As indicated in the HeaderFilter entry, all HTTP calls for
the web server www.att.com will be processed by the cgi function forward.cgi before being sent out to the remote web
server (or proxy server), and so on. Filters can also be specified using an extended URL command. For example, the
system can be configured to call filters with the arguments:

http://localhost/javabin/inputfilter?url=cached_url&path=cache_file
http://localhost/javabin/outputfilter?url=cached_url&path=cache_file
http://localhost/javabin/headerfilter?htheader=header_lines
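
The pattern-based selection of filters can be sketched in Python as below. The configuration entries are modeled on the ones above; the functions `parse_filters` and `filters_for` are hypothetical, and the sketch assumes the URL patterns behave like shell-style globs in which "*" matches any run of characters (the source does not define the pattern syntax).

```python
from fnmatch import fnmatchcase

# Entries modeled on the configuration example in the text.
CONFIG = """\
InputFilter /bin/fpack.cgi http://*/resume.html
OutputFilter /bin/funpack.cgi http://*/resume.html
HeaderFilter /bin/forward.cgi http://www.att.com/*
"""


def parse_filters(text):
    """Return (kind, program, pattern) tuples, skipping malformed lines."""
    rules = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            rules.append(tuple(parts))
    return rules


def filters_for(url, rules):
    """List the (kind, program) pairs whose URL pattern matches the given URL."""
    return [(kind, program)
            for kind, program, pattern in rules
            if fnmatchcase(url, pattern)]
```

With these rules, a request for any host's resume.html selects the input and output filters, while any request to www.att.com selects only the header filter.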
[0019] Filter programs can be implemented as cgi-bin programs and inherit the cgi-bin API. Every filter program
implemented as a Java class will create an instance when invoked. Figs. 7A, 7B, and 7C show example cgi-bin filter
programs. Fig. 7A shows an inputfilter, the Execute method reading the "cache_file" and updating the file if necessary.
In Fig. 7B, outputfilter uses "cache_file" as input and sends the result to OutputStream, which will be sent back to the
original caller. Fig. 7C shows headerfilter, which gets the original http header (multiple lines without the empty line) from
Hashtable, and outputs a new header to OutputStream.

3. Scripting Facilities 

[0020] In addition to Java class invocation mechanisms and standard scripting mechanisms such as Ksh and Perl,
it is advantageous to provide support for document pre-processor scripting. Proxy-side scripting allows the different
system components to be integrated in a light-weight manner that has a syntax and a semantics that is very HTML-like.
Scripting provides a method for plugged-in services/functions to access built-in functions, server status and data
structure, and the cache. A scripting proxy preprocessor can also provide a communication medium for multiple proxy
servers.

[0021] The inventors devised a scripting language providing extra macro statements based on the standard
document markup language HTML. The proxy server pre-processes the scripts and turns the comments into pure HTML.
In a preferred embodiment of such scripting, it should include support for variable declarations, conditional statements,
sets of built-in functions, and interfaces to invoke cgi-bin. The script can be a standard HTML-like plain text file that
contains the macro statements. In order to identify the document as a script, an identifier should be included; for example,
the first line in the script can start with:

#!/iproxy/script

The arguments of a cgi-bin can be accessed using "${arg}" inside the script file. Fig. 8 sets forth a list of built-in
statements that are useful for scripting. When using the "iblock" and "ijavabin" commands, the script will invoke a javabin
class with the arguments set forth in Fig. 9.
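
The script identifier and the "${arg}" argument access can be sketched in Python as follows. This covers only those two points, not the built-in statements of Fig. 8; `is_script` and `expand_args` are hypothetical names, and leaving unknown arguments untouched is an assumption rather than documented behavior.

```python
import re

SCRIPT_MARKER = "#!/iproxy/script"  # identifier on the first line, as in the text


def is_script(text):
    """True if the document announces itself as a proxy script."""
    return text.startswith(SCRIPT_MARKER)


def expand_args(text, args):
    """Replace each ${name} with args[name]; unknown names are left as written."""
    def substitute(match):
        return args.get(match.group(1), match.group(0))
    return re.sub(r"\$\{(\w+)\}", substitute, text)
```

For instance, `expand_args("<p>Hello ${user}</p>", {"user": "Ann"})` yields `<p>Hello Ann</p>`. A real pre-processor would likely also escape the substituted values before emitting HTML.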

4. Walking Facilities

[0022] One useful facility is a basic function supporting a mechanism to walk through document page hierarchies.
This is similar to what "find" and "ftw" (file tree walk) do on Unix file systems. The walking action is specified by a root
URL where the action starts, a specification of how many levels the action will visit, and certain additional properties
such as whether or not image files should be included. For each visited page, one or more of a list of functions can be
invoked one by one to perform tasks on a cache of the page. Examples of such functions include functions for archiving
the web pages, searching for keywords, and creating index tables.

[0023] See Fig. 10 for the structure of a walking function, "htwalk", designed in accordance with a preferred
embodiment of the present invention. The syntax for the walking function is given in Fig. 3. For example, for the URL

http://www.att.com/?iproxy&htwalk=3,-local,-image,archive,grep=Cable

the system walks down 3 levels for pages referred (directly or indirectly) by www.att.com, including image files, but only
for those pages on the local server. For each visited page, the system calls the function archive to archive the page and
the function grep to search for the keyword "Cable" in the page. For each visited page, the system calls the following
cgi-bin programs one-by-one:

/bin/archive.cgi?url=visited_page&args=int_no,-local&path=cached_file&htwalk=archive
/bin/grep.cgi?url=visited_page&args=int_no,-local&path=cached_file&htwalk=grep&walkDpl=Cache

Fig. 11 sets forth an example of a grep class used to implement the walking function.
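
A minimal Python sketch of the walking idea follows. It parses an htwalk value of the form used in the example and performs a depth-limited breadth-first visit over an in-memory link table. The function names, the `links` dictionary, and the convention that the root page counts as level 0 are all assumptions; the actual syntax is the one given in Fig. 3, which is not reproduced here.

```python
from collections import deque


def parse_htwalk(spec):
    """Split '3,-local,-image,archive,grep=Cable' into depth, options, functions."""
    fields = [field.strip() for field in spec.split(",")]
    depth = int(fields[0])
    options = [f[1:] for f in fields[1:] if f.startswith("-")]
    functions = []
    for f in fields[1:]:
        if not f.startswith("-"):
            name, _, arg = f.partition("=")
            functions.append((name, arg or None))
    return depth, options, functions


def walk(root, links, depth):
    """Visit pages reachable from root within `depth` links, breadth first."""
    seen, order, queue = {root}, [], deque([(root, 0)])
    while queue:
        page, level = queue.popleft()
        order.append(page)  # a real walker would invoke archive, grep, ... here
        if level < depth:
            for target in links.get(page, []):
                if target not in seen:
                    seen.add(target)
                    queue.append((target, level + 1))
    return order
```

Each visited page is reported once even if several pages link to it, which keeps the per-page functions from running twice on the same cached copy.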

50 

5. Archiving Service

[0024] The URLs processed by the proxy 30 can be extended to include archive directives that add data to a
storage repository, e.g. device 150, and for retrieving the archived data. The new commands are intercepted and performed
by the proxy server. Because a proxy is used as a middleman between the browser and the web servers, the new
archiving services are just plug-in components and do not interfere with existing components and protocols.

[0025] As an advantageous example, the URL naming scheme can be extended to include URLs in the following
format:

http://view@host/path

where the "view" is a date, for example in the format yyyymmdd, when the corresponding page (i.e., http://host/path)
has been retrieved from the original web server and stored into the archive repository. The timestamp can be used as
the key to locate the page from the repository. For example, http://19980701@www.att.com/ would refer to the page
http://www.att.com that was retrieved and archived from the www.att.com web server on July 1, 1998.
[0026] More advanced features can be implemented to allow specifying an action in front of the date for locating an
alternative page on the archive server if the dated page does not exist. The system first locates the page archived on
that date. If the desired page is not found in the repository, the system then searches for the page in the repository
before or after the date and returns the first found page. For example, the following syntax for naming archived pages
can be used:

Thus, http://+date@host/path would choose the first page found after the specified date; on the other hand,
http://-date@host/path points to the first page before the date. For example, http://+19980701@www.att.com points to the
first page that was archived on July 1, 1998 or after, while http://-19980701@www.att.com is for the page archived on July
1, 1998, or before. As shown in the syntax, the granularity of "view" can be readily taken down to the level of
hours, minutes, and seconds.
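
The date-prefixed naming scheme and the "+"/"-" search can be sketched in Python as below. The sketch assumes the yyyymmdd form only and that the repository exposes, per page, a sorted list of archive dates; `parse_view` and `lookup` are hypothetical names, not part of the described system.

```python
import re
from bisect import bisect_left, bisect_right


def parse_view(url):
    """Split 'http://[+|-]yyyymmdd@host/path' into (mode, date, plain_url)."""
    match = re.fullmatch(r"http://([+-]?)(\d{8})@(.+)", url)
    if match is None:
        return None  # not an archive-style URL
    mode, date, rest = match.groups()
    return mode, date, "http://" + rest


def lookup(dates, mode, date):
    """Pick an archive date from a sorted list of yyyymmdd strings.

    '+' selects the first copy on or after the date, '-' the last copy on or
    before it, and an empty mode requires an exact match.
    """
    if mode == "+":
        i = bisect_left(dates, date)
        return dates[i] if i < len(dates) else None
    if mode == "-":
        i = bisect_right(dates, date)
        return dates[i - 1] if i else None
    return date if date in dates else None
```

Because yyyymmdd strings sort in chronological order, a binary search over the sorted list is enough to find the nearest archived copy in either direction.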

[0027] The above naming scheme advantageously is compatible with the conventional URL protocol, which
defines network resource naming as proto://[user[:password]@]host/path. For example, the URL
ftp://foo:bar@ftp.research.att.com/README points to a README page on the FTP server ftp.research.att.com, while
accessing on behalf of the user foo with the password bar. The user/password portion of the URL protocol is undefined
in the HTTP protocol; accordingly, the above naming scheme can take advantage of this for archive naming.

[0028] Multiple methods can be used to invoke the archive service and store data in the repository:

Command Extensions. Using any of the methods described above under Section 1, a user can invoke the archiving
service. For example, the following URL:

http://www.research.att.com/iPROXY.html?iproxy&action=archive

can be used to cause the system to archive the page

http://www.research.att.com/iPROXY.html.

Scheduled Archiving. The system can also be extended to contain a server that executes an archiving task at a
designated time, e.g. like the cron command in Unix systems. The walking facilities described above can be used to
"walk" through a web site for a set of HTML pages. As described above, the walking can be defined by a root URL
and parameterized by (a) the depth of walking through hyper-references under HTML pages, (b) with or without
image files embedded in HTML pages, and (c) walking through pages on the local web site or on all web sites. Fig.
12 shows an example page interface for a cron cgi program which can be used to add a cron job to archive a web
site. Users can schedule an archive task on a daily or weekly basis.

Transparent Archiving. The system can support a function of automatically archiving selected web pages whenever
they are accessed. The specification can be done through the CacheFlag of a cache command, which
specifies the caching policy:

CacheFlag Archive URL-expression

Whenever a requested URL matches the URL-expression, the system puts the data in the archive repository.

Archiving Browser Cache. The system can support a function for scanning data cached in a browser's cache area
and archiving them into the repository.

[0029] As for retrieving information from the repository, Fig. 13 shows an example screenshot of an interface
that can be used to browse the archived information. Each month's data of each website can be stored in a repository
structure similar to a Unix-like pax archive. A cgi program creates a page on-the-fly based on the contents of the archive
and creates hyperlinks for each website name. The hyperlinks are listed for each month in which certain pages of the
website were archived. For example, the AT&T website has a cron job to archive the web pages on a daily basis, so the
data in Fig. 13 has all twelve months in 1998 listed as hyperlinks.
[0030] Multiple archive repositories can take advantage of the above-mentioned features for inter-proxy
communication to share the burden of storing the information. For example, Fig. 14 shows how multiple archive repositories
can collaborate in the archiving task. The proxy server in Fig. 14 queries another repository in order to answer a user request.
As shown in Fig. 14, the following advantageous web interface can be used to search the repository or repositories:

http://proxyhost/bin/iget/view@host/path

A repository search path variable can be set using a command such as:

RepositoryPath=Host:Port[;Host:Port]*

Thus, multiple repositories may be easily addressed and accessed to quickly locate the desired information.
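
One way to read the RepositoryPath setting is as an ordered search path. The Python sketch below parses the Host:Port list and queries the repositories in order until one answers. The function names are hypothetical, and `fetch` is a caller-supplied stand-in for the inter-proxy request, which the text does not specify in detail.

```python
def parse_repository_path(value):
    """Parse 'Host:Port[;Host:Port]*' into a list of (host, port) pairs."""
    repositories = []
    for entry in value.split(";"):
        host, _, port = entry.strip().rpartition(":")
        repositories.append((host, int(port)))
    return repositories


def find_first(repositories, key, fetch):
    """Ask each repository in order; return ((host, port), page) or None.

    `fetch(host, port, key)` is an assumed callable that returns the archived
    page, or None when that repository does not hold it.
    """
    for host, port in repositories:
        page = fetch(host, port, key)
        if page is not None:
            return (host, port), page
    return None
```

Searching in list order lets the nearest or cheapest repository be placed first, with more distant repositories consulted only on a miss.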
6. Packaging Service

[0031] A packaging service can be provided which would allow the browsing of document pages from portable
packages. The package is portable in that it can be possible to attach one to an e-mail or to copy one from one machine
to another. To create a package, the walking facility may be used to visit the sets of web pages that are designated to
be packed. Thus, the system would accept (a) a root URL, (b) a parameter expressing the depth of the walking through
the root URL's direct and indirect references, (c) an image option to decide whether or not to include the image files
when walking through the web pages, (d) a reference filter option (for accessing all references or filtering out some of
them), (e) mechanisms for storing cookies needed for accessing the root URL, and (f) a package name. The system
then walks through the set of web pages rooted by the designated URL and packs them into a package using the
designated name. The package can use any of the number of known conventional compression formats to package the
data. The package is advantageously self-contained, including two physical files: one for contents and another for
indexing. Each package can maintain its own index table.
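
The two-file package layout (one file for contents, one for indexing) can be sketched in Python as follows. This is an illustration only: the `.dat` and `.idx` extensions, the JSON index, and the function names are assumptions, and no compression is applied, whereas the text allows any conventional compression format.

```python
import json
from pathlib import Path


def write_package(directory, name, pages):
    """Write pages (dict of URL -> bytes) as name.dat plus a name.idx index."""
    index, offset = {}, 0
    with open(Path(directory) / f"{name}.dat", "wb") as contents:
        for url, body in pages.items():
            contents.write(body)
            index[url] = [offset, len(body)]  # where this page sits in the file
            offset += len(body)
    (Path(directory) / f"{name}.idx").write_text(json.dumps(index))


def read_page(directory, name, url):
    """Return the stored bytes for url, or None if the package lacks it."""
    index = json.loads((Path(directory) / f"{name}.idx").read_text())
    if url not in index:
        return None
    offset, length = index[url]
    with open(Path(directory) / f"{name}.dat", "rb") as contents:
        contents.seek(offset)
        return contents.read(length)
```

Keeping the index separate from the contents means a page can be located with one small read of the index and one seek into the contents file, without a network connection.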

[0032] In order to browse the package, the proxy system can support functions to access the web pages stored in
the package as if they were from the original web sites. The packages are stored on a local disk, and browsing does
not require a network connection. When browsing, the server uses the above-mentioned index table to locate
corresponding content in the package. The server that generates the package may be different from the one that browses
the package. The cookies used to traverse the web pages are automatically handled by the system.

[0033] New documents can easily be appended to an existing package. A single package may, in fact, contain more
than one root URL. Each root URL in the package can be chosen as an entry point to browse the package.

7. Personalized Services

[0034] It is notable that the proxy system described above can have access to a user's private information such as
the web access history and sensitive financial information. A user may not be comfortable with providing this
information to a server across the communication network. The same information can be stored in a safe location locally, on
the client machine where the proxy has been integrated on the client-side or on a local centralized proxy closer to the
user. The present invention permits several new personalized services to be provided.

[0035] The proxy can be used to integrate in an automatic manner the information provided by a typical portal with
sensitive personal information. For example, a user issues a request for the following URL:

http://www.att.net/?iproxy&action=portal

The portal command would cause the proxy server to retrieve the home page from www.att.net, which has been
encoded with scripting directives that instruct the proxy server how to process the local data and merge it with the
server contents. It then presents the personal portal page back to the user. In order to provide scripting that is
non-intrusive to other users who are not using the present invention, the scripting directives can be embedded in HTML
comments as shown in Fig. 15. The proxy server intercepts the directives in Fig. 15 and performs the necessary actions
before returning the server portal page to the browser. All the directives, being embedded in an HTML comment, are
ignored by browsers not using a proxy server configured in accordance with the present invention.
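
The idea of hiding directives in HTML comments can be sketched in Python as below. The directive syntax of Fig. 15 is not reproduced in the text, so the form `<!-- iproxy name argument -->` used here is purely hypothetical, as are the `process` function and the handler table.

```python
import re

# Hypothetical directive form: <!-- iproxy NAME ARGUMENT -->
DIRECTIVE = re.compile(r"<!--\s*iproxy\s+(\w[\w-]*)\s*(.*?)\s*-->", re.DOTALL)


def process(html, handlers):
    """Replace each recognized directive comment with its handler's output.

    Ordinary comments and directives without a handler are left untouched,
    so the page still renders normally without the proxy.
    """
    def expand(match):
        name, argument = match.group(1), match.group(2)
        handler = handlers.get(name)
        return handler(argument) if handler else match.group(0)
    return DIRECTIVE.sub(expand, html)
```

A browser that receives the page directly simply skips the comments, while a proxy running something like `process` swaps them for locally computed content before the page reaches the user.
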
[0036] The following are some illustrative examples of how to use the capabilities of personalizing information
retrieved from the public servers. They are merely examples and are not meant to limit the nature of the present
invention:

Personal Web Page Reminders and Hot Sites. Since the proxy server can log a user's web accesses, it can analyze
the log and use the data to provide new web services that can improve the user's browsing experience. Fig. 16
shows, for example, two possible services: (1) "TO-READ" homepages. A user can specify a list of websites and
corresponding frequencies when certain websites should be visited. The proxy can check the last visiting dates and
schedule a list of pages that the user should visit today. (2) "HOT Sites." The proxy server can compute the number
of visits to each website and list the top ten websites with their last visiting dates accordingly as soon as the user
accesses his/her personal portal. There are several advantages of having such a list. The user may want to know
the last visiting time of a favorite site - the timestamp can be displayed along with the website link. The user may
wish to access the latest version by clicking on the link. The user may wish to compare the new version with the old
using some differencing tool, especially since the previous versions of the web page can be archived as described
above. Fig. 15 shows the directives used to construct the page displayed in Fig. 16. The directive "to-read"
constructs a list of web pages scheduled to be read; the directive "dolog" analyzes the current web access log to
produce the statistics needed for the next directive, "top10", which presents the results on the personal portal.
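
The "HOT Sites" computation can be sketched in Python as follows. The log format is an assumption (pairs of an ISO 8601 timestamp string and a URL), and `hot_sites` is a hypothetical name; the directives of Fig. 15 are not modeled.

```python
from collections import Counter
from urllib.parse import urlsplit


def hot_sites(log, limit=10):
    """Rank websites by visit count and report each site's last visit.

    `log` is an iterable of (timestamp, url) pairs, with timestamps written
    as ISO 8601 strings so that string comparison follows time order.
    """
    visits, last_seen = Counter(), {}
    for timestamp, url in log:
        site = urlsplit(url).netloc
        visits[site] += 1
        last_seen[site] = max(timestamp, last_seen.get(site, timestamp))
    return [(site, count, last_seen[site])
            for site, count in visits.most_common(limit)]
```

Since the log never leaves the proxy, the ranking and the last-visit timestamps can be shown on the personal portal without the portal server learning the user's browsing history.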
Personalized Financial Page. Most portals currently allow users to specify the stocks that they are interested in and
display the latest stock price when a user accesses the personalized page. See Fig. 17, which shows a typical AT&T
WorldNet portal showing various stock quotes. However, the portal cannot compute your current balance or net
gain/loss unless you provide private and sensitive information like how many shares you own and when you purchased
them. Most users would not like to provide such information to their portal page server. In accordance with
an embodiment of the present invention, personalized information such as real purchase price, commission
fees, and the number of shares of each stock can be stored and accessed by the proxy server. By constructing an
output filter for the stock page, the proxy server can retrieve the private information and combine it with stock quotes
provided by the portal site to compute the balance, net gain/loss, and other interesting personalized
information/statistics. Fig. 18 shows the same view as Fig. 17 personalized with the user's information. This can be
accomplished by a specification like the following in the proxy configuration file:

OutputFilter /bin/portfolio.cgi http://stocks.planetdirect.com/tportfolio.asp

This instructs the system to apply the Java class "portfolio" as an output filter whenever the browser issues the
corresponding http request. The numbers in Fig. 18 are visible only to the client and not the original portal server. The
user is shown as having bought 267 AT&T shares, 50 Netscape shares, and 40 E*Trade shares at the respective
prices of $36.50, $21.0, and $39.38 each. The commissions were $0, $19.95, and $19.95. The total gain was
$17,799.02. The numbers replaced by the proxy server are shown in a different shade.
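The computation performed by such an output filter can be sketched as follows. This is an illustrative example only, not the "portfolio" class itself: the names PortfolioGain and Holding are hypothetical, and the formula (shares times the difference between the current quote and the purchase price, less the commission) is an assumption about how net gain/loss is derived from the privately stored values. The current quote is supplied by the caller, as it would be taken from the portal page.

```java
import java.math.BigDecimal;

final class PortfolioGain {
    /** One privately stored holding; the field names are illustrative. */
    static final class Holding {
        final String symbol;
        final int shares;
        final BigDecimal purchasePrice;
        final BigDecimal commission;

        Holding(String symbol, int shares, String purchasePrice, String commission) {
            this.symbol = symbol;
            this.shares = shares;
            this.purchasePrice = new BigDecimal(purchasePrice);
            this.commission = new BigDecimal(commission);
        }
    }

    /** Assumed formula: shares * (currentQuote - purchasePrice) - commission. */
    static BigDecimal netGain(Holding h, BigDecimal currentQuote) {
        return currentQuote.subtract(h.purchasePrice)
                .multiply(BigDecimal.valueOf(h.shares))
                .subtract(h.commission);
    }
}
```

For instance, the Netscape position described above could be represented as new Holding("Netscape", 50, "21.00", "19.95") and passed to netGain together with the quote parsed from the portal page; the result stays on the proxy and is never sent to the portal server.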

Personal Web Archive. While current search engines allow users to find pages on a certain topic easily, they do not
offer much help in locating and viewing the pages a user has seen in the past, except for those that are still kept in
the browser cache. Due to the sharp decrease in storage costs, a client-side proxy can afford to archive all the web
pages a user has seen so that any of these pages can be retrieved easily later on, without even bookmarking them.
Existing webpage search tools such as Alta Vista Discovery can be used to index the web archive and search the
pages. A user can then quickly conduct a search of all pages he/she has seen in the last year, for example. As
described in the archiving section above, the instant system can intercept http requests and effectively extend the
URL name space to address pages stored in the archive by adding a timestamp in front of the regular http address.
Even as the web pages go through major redesigns, the original pages can be accessed using the archive extensions
to give the same content.
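As an illustration of how a proxy might recognize such an extended address, the following sketch separates the timestamp from the regular http address. The syntax used here (a 14-digit timestamp of the form yyyyMMddHHmmss, a slash, then the regular address) and the class name ArchiveAddress are assumptions made purely for the example; the actual archive syntax is the one given in the archiving section and may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ArchiveAddress {
    // Hypothetical syntax: <14-digit timestamp>/<regular http address>
    private static final Pattern ARCHIVED = Pattern.compile("^(\\d{14})/(http://.+)$");

    final String timestamp; // null when the request addresses the live page
    final String url;       // the regular http address

    private ArchiveAddress(String timestamp, String url) {
        this.timestamp = timestamp;
        this.url = url;
    }

    /** Splits an intercepted request into its optional timestamp and the regular address. */
    static ArchiveAddress parse(String requested) {
        Matcher m = ARCHIVED.matcher(requested);
        if (m.matches()) {
            return new ArchiveAddress(m.group(1), m.group(2));
        }
        return new ArchiveAddress(null, requested);
    }

    boolean isArchived() {
        return timestamp != null;
    }
}
```

A request that parses as archived would be served from the local archive copy stored under that timestamp, while any other request would be forwarded to the origin server as usual.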

[0037] The foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary,
but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description,
but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood
that the embodiments shown and described herein are only illustrative of the principles of the present invention and that
various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the
invention. For example, the detailed description has been described with particular emphasis on the Internet standards
of HTTP, URLs, and HTML. However, the principles of the present invention could be extended to other protocols for
serving information from a server to a client. Such an extension could be readily implemented by one of ordinary skill
in the art given the above disclosure.

Claims

1. A method of providing access to information stored at a server comprising the steps of:

establishing a document serving protocol channel to a server; 

receiving a document from the server via the document serving protocol channel; 
retrieving local information;

customizing the document to reflect the local information before transmission of the document to the user. 

2. The method of claim 1 wherein the customizing step is based on scripting commands embedded in the document. 

3. The method of claim 1 wherein the customizing step is based on a filter which specifies which documents retrieved
from which servers should be customized and how they should be customized.

4. The method of claim 1 wherein the local information is stored on a local storage device. 

5. The method of claim 1 wherein the local information is private user information.

6. The method of claim 1 wherein the local information is user access information. 

7. The method of claim 1 wherein the document serving protocol is the hypertext transfer protocol.

8. The method of claim 1 wherein the document is a web page. 

9. The method of claim 1 wherein the server is a web server. 

10. A computer readable medium containing executable program instructions for performing a method on a computer
connected to a communication network comprising the steps of:

establishing a document serving protocol channel to a server; 
receiving a document from the server via the document serving protocol channel;

retrieving local information; 

customizing the document to reflect the local information before transmission of the document to the user. 

11. The computer readable medium of claim 10 wherein the customizing step is based on scripting commands embedded
in the document.

12. The computer readable medium of claim 10 wherein the customizing step is based on a filter which specifies which
documents retrieved from which servers should be customized and how they should be customized. 

13. The computer readable medium of claim 10 wherein the local information is stored on a local computer readable
storage device. 

14. The computer readable medium of claim 10 wherein the local information is private user information. 

15. The computer readable medium of claim 10 wherein the local information is user access information.

16. The computer readable medium of claim 10 wherein the document serving protocol is the hypertext transfer protocol.

17. The computer readable medium of claim 10 wherein the document is a web page.

18. The computer readable medium of claim 10 wherein the server is a web server.

19. The computer readable medium of claim 10 wherein the network is the Internet.

20. A proxy server comprising: 

a first interface for establishing a document serving protocol channel to a server; 
a second interface for establishing a second document serving protocol channel to at least one client; 
a storage device for storing local information;

a processor adapted to customize documents received from the server with the local information before
transmission back to the user.

21. The proxy server of claim 20 wherein the customizing step is based on scripting commands embedded in the
document.

22. The proxy server of claim 20 wherein the customizing step is based on a filter which specifies which documents
retrieved from which servers should be customized and how they should be customized.

23. The proxy server of claim 20 wherein the local information is private user information.

24. The proxy server of claim 20 wherein the local information is user access information.

25. The proxy server of claim 20 wherein the document serving protocol is the hypertext transfer protocol.

26. The proxy server of claim 20 wherein the client is a browser.

27. The proxy server of claim 20 wherein the document is a web page.

28. The proxy server of claim 20 wherein the server is a web server.
FIG. 2

[Drawing text recovered: "Proxy Main Thread (ipmain)"; reference numerals 2010, 2020, 2030]

EP1039396A2

FIG. 3
FIG. 4

FIG. 6

FIG. 7A

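import java.io.IOException;
import java.io.OutputStream;
import java.util.Hashtable;

public class inputFilterSample extends javabin {

    // Input filter: rewrites the cached copy of the document in place.
    public Object Execute(OutputStream os, Hashtable ht) throws IOException {
        String uri = (String) ht.get("uri");
        String fname = (String) ht.get("path");
        byte[] orig_data = readCacheFile(fname);
        byte[] result_data = doInputFilter(orig_data);
        writeCacheFile(fname, result_data);
        return null; // no result object is produced by this sample
    }

}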


FIG. 7B 

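import java.io.IOException;
import java.io.OutputStream;
import java.util.Hashtable;

public class outputFilterSample extends javabin {

    // Output filter: transforms the cached document on its way to the client.
    public Object Execute(OutputStream os, Hashtable ht) throws IOException {
        String uri = (String) ht.get("uri");
        String fname = (String) ht.get("path");
        byte[] cached_data = readCacheFile(fname);
        byte[] result_data = doOutputFilter(cached_data);
        os.write(result_data);
        return null; // no result object is produced by this sample
    }

}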



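FIG. 7C

import java.io.IOException;
import java.io.OutputStream;
import java.util.Hashtable;

public class headerFilterSample extends javabin {

    // Header filter: rewrites the header before it is written out.
    public Object Execute(OutputStream os, Hashtable ht) throws IOException {
        String orig_header = (String) ht.get("ht.header");
        String result_header = doHeaderFilter(orig_header);
        os.write(result_header.getBytes());
        return null; // no result object is produced by this sample
    }

}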
FIG. 10

FIG. 11

import java.io.IOException;
import java.io.OutputStream;
import java.util.Hashtable;

public class htwalkGrepSample extends javabin {
    public void initProc() {
        // init process
    }

    public void finalProc() {
        // final process
    }

    public Object Execute(OutputStream os, Hashtable ht) throws IOException {
        initProc(); // init process
        String htcmd = (String) ht.get("htcmd");
        String uri = (String) ht.get("uri");
        String args = (String) ht.get("args");
        String fname = (String) ht.get("path");
        String key = (String) ht.get("walkopt");

        finalProc(); // final process

        // Report the page only when its cached file contains the keyword.
        if (keywordMatched(fname, key)) {
            ShowGrepResult(uri, fname);
        }
        return null; // no result object is produced by this sample
    }
}
FIG. 12

FIG. 13

FIG. 16

FIG. 17